3D rendering as functional reactive programming
I’ve been playing with a simple/general semantics for 3D. In the process, I was surprised to see that a key part of the semantics looks exactly like a key part of the semantics of functional reactivity as embodied in the library Reactive. A closer look revealed a closer connection still, as described in this post.
What is 3D rendering?
Most programmers think of 3D rendering as executing a sequence of side-effects on a frame buffer or some other mutable array of pixels. This way of thinking (sequences of side-effects) comes to us from the design of early sequential computers. Although computer hardware architecture has evolved a great deal, most programming languages, and hence most programming thinking, are still shaped by this first sequential model. (See John Backus’s Turing Award lecture Can Programming Be Liberated from the von Neumann Style? A functional style and its algebra of programs.) The invention of monadic Imperative functional programming allows Haskellers to think and program within the imperative paradigm as well.
What’s a functional alternative?
Rendering is a function from something to something else.
Let’s call these somethings (3D) “Geometry” and (2D) “Image”, where Geometry and Image are types of functional (immutable) values.
type Rendering = Image Color
render :: Geometry -> Rendering
To simplify, I’m assuming a fixed view. What remains is to define what these two types mean and, secondarily, how to represent and implement them.
An upcoming post will suggest an answer for the meaning of Geometry. For now, think of it as a collection of curved and polygonal surfaces, i.e., the outsides (boundaries) of solid shapes.
Each point on these surfaces has a location, a normal (perpendicular direction), and material properties (determining how light is reflected by and transmitted through the surface at the point).
The geometry will contain light sources.
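As a purely illustrative sketch of that per-point information, one might write something like the following; the names here are mine, not FieldTrip’s or any other library’s:

> -- Hypothetical sketch of per-point surface information: location, normal,
> -- and material properties, with the material left abstract.
> type Point3  = (Double, Double, Double)  -- a location in 3D space
> type Vector3 = (Double, Double, Double)  -- a direction, e.g. a surface normal
>
> data SurfacePoint material = SurfacePoint
>   { spLocation :: Point3    -- where the point is
>   , spNormal   :: Vector3   -- perpendicular direction at that point
>   , spMaterial :: material  -- how light reflects/transmits there
>   }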
Next, what is the meaning of Image?
A popular answer is that an image is a rectangular array of finite-precision encodings of color (e.g., with eight bits for each of red, blue, green and possibly opacity).
This answer leads to poor compositionality and complex meanings for operations like scaling and rotation, so I prefer another model.
As in Pan, an image (the meaning of the type Image Color) is a function from infinite continuous 2D space to colors, where the Color type includes partial opacity.
For motivation of this model and examples of its use, see Functional images and the corresponding Pan gallery of functional images.
Composition occurs on infinite & continuous images.
After all composition is done, the resulting image can be sampled into a finite, rectangular array of finite-precision color encodings. I’m talking about a conceptual/semantic pipeline. The implementation computes the finite sampling without having to compute the values for the entire infinite image.
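To fix ideas, here is a tiny sketch that takes the semantic model literally: images as functions, plus a deliberately naive sampler. The particular types (Double coordinates, a list-of-lists result) are illustrative assumptions, not Pan’s actual (optimized) implementation.

> -- Images as functions over infinite, continuous 2D space.
> type Point   = (Double, Double)
> type Image a = Point -> a
>
> -- Example image: a unit disk (True inside, False outside).
> disk :: Image Bool
> disk (x, y) = x*x + y*y <= 1
>
> -- Conceptual sampling step: evaluate the image at finitely many grid points.
> sample :: Int -> Int -> (Double, Double) -> (Double, Double) -> Image a -> [[a]]
> sample w h (x0, y0) (x1, y1) im =
>   [ [ im (x0 + fromIntegral i * dx, y0 + fromIntegral j * dy)
>     | i <- [0 .. w-1] ]
>   | j <- [0 .. h-1] ]
>   where
>     dx = (x1 - x0) / fromIntegral w
>     dy = (y1 - y0) / fromIntegral h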
Rendering has several components. I’ll just address one and show how it relates to functional reactive programming (FRP).
Visual occlusion
One aspect of 3D rendering is hidden surface determination. Relative to the viewer’s position and orientation, some 3D objects may be fully or partially occluded by nearer objects.
An image is a function of (infinite and continuous) 2D space, so specifying that function means determining its value at every sample point. Each point can correspond to a number of geometric objects, some closer and some further away. If we assume for now that our colors are fully opaque, then we’ll need to know the color (after transformation and lighting) of the nearest surface point that projects onto the sample point. (We’ll remove this opacity assumption later.)
Let’s consider how we’ll combine two Geometry values into one:
union :: Geometry -> Geometry -> Geometry
Because of occlusion, the render function cannot be compositional with respect to union. If it were, then there would exist a function unionR such that
forall ga gb. render (ga `union` gb) == render ga `unionR` render gb
In other words, to render a union of two geometries, we can render each and combine the results.
The reason we can’t find such a unionR is that render doesn’t let unionR know how close each colored point is.
A solution then is simple: add in the missing depth information:
type RenderingD = Image (Depth, Color) -- first try
renderD :: Geometry -> RenderingD
Now we have enough information for compositional rendering, i.e., we can define unionR such that
forall ga gb. renderD (ga `union` gb) == renderD ga `unionR` renderD gb
where
unionR :: RenderingD -> RenderingD -> RenderingD
unionR im im' p = if d <= d' then (d,c) else (d',c')
  where
    (d ,c ) = im p
    (d',c') = im' p
When we’re done composing, we can discard the depths:
render g = snd . renderD g
or, with Semantic editor combinators:
render = (result.result) snd renderD
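In case the semantic-editor-combinators post is unfamiliar: result edits the result side of a function and is simply composition, so the two definitions of render agree. A quick sketch:

> -- "result" from the semantic editor combinators post: edit the result
> -- of a function. It is just composition.
> result :: (b -> b') -> ((a -> b) -> (a -> b'))
> result = (.)
>
> -- (result.result) snd renderD == \g -> snd . renderD g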
Simpler, prettier
The unionR definition is not very complicated, but still, I like to tease out common structure and reuse definitions wherever I can. The first thing I notice about unionR is that it works pointwise. That is, the value at a point is a function of the values of two other images at the same point. This pattern is captured by liftA2 on functions, thanks to the Applicative instance for functions.
liftA2 :: (b -> c -> d) -> (a -> b) -> (a -> c) -> (a -> d)
So that
unionR = liftA2 closer
closer (d,c) (d',c') = if d <= d' then (d,c) else (d',c')
Or
closer dc@(d,_) dc'@(d',_) = if d <= d' then dc else dc'
Or even
closer = minBy fst
  where
    minBy f u v = if f u <= f v then u else v
This definition of unionR is not only simpler, it’s quite a bit more general, as type inference reveals:
unionR :: (Ord a, Applicative f) => f (a,b) -> f (a,b) -> f (a,b)
closer :: Ord a => (a,b) -> (a,b) -> (a,b)
Once again, simplicity and generality go hand-in-hand.
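For instance, nothing in this unionR is about images at all. Here is a (contrived) use at the Maybe applicative, just to show the extra generality; the example values are of course made up:

> -- unionR = liftA2 closer works at any Applicative; at Maybe it keeps the
> -- pair with the smaller first component when both arguments are present.
> nearThing, farThing :: Maybe (Int, String)
> nearThing = Just (3, "near")
> farThing  = Just (5, "far")
>
> -- liftA2 closer nearThing farThing == Just (3,"near")
> -- liftA2 closer nearThing Nothing  == Nothing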
Another type class morphism
Let’s see if we can make union rendering simpler and more inevitable.
Rendering is nearly a homomorphism. That is, render nearly distributes over union, but we have to replace union by unionR.
I’d rather eliminate this discrepancy, ending up with
forall ga gb. renderD (ga `op` gb) == renderD ga `op` renderD gb
for some op that is equal to union on the left and unionR on the right.
Since union and unionR have different types (with neither being a polymorphic instance of the other), op will have to be a method of some type class. My favorite binary method is mappend, from Monoid, so let’s give it a try.
Monoid requires there also to be an identity element mempty and that mappend be associative. For Geometry, we can define
instance Monoid Geometry where
  mempty  = emptyGeometry
  mappend = union
Images with depth are a little trickier. Image already has a Monoid instance, whose semantics is determined by the principle of type class morphisms, namely
The meaning of an instance is the instance of the meaning
The meaning of an image is a function, and functions have a Monoid instance:
instance Monoid b => Monoid (a -> b) where
  mempty = const mempty
  f `mappend` g = \a -> f a `mappend` g a
which simplifies nicely to a standard form, by using the Applicative instance for functions.
instance Applicative ((->) a) where
  pure = const
  hf <*> xf = \a -> (hf a) (xf a)
instance Monoid b => Monoid (a -> b) where
  mempty  = pure mempty
  mappend = liftA2 mappend
We’re in luck. Since we’ve defined unionR as liftA2 closer, we just need it to turn out that closer == mappend, and that closer is associative and has an identity element.
However, closer is defined on pairs, and the standard Monoid instance on pairs doesn’t fit.
instance (Monoid a, Monoid b) => Monoid (a,b) where
  mempty = (mempty,mempty)
  (a,b) `mappend` (a',b') = (a `mappend` a', b `mappend` b')
To avoid this conflict, define a new data type to be used in place of pairs.
data DepthG d a = Depth d a -- first try
Alternatively,
newtype DepthG d a = Depth { unDepth :: (d,a) }
I’ll go with this latter version, as it turns out to be more convenient.
Then we can define our monoid:
instance (Ord d, Bounded d) => Monoid (DepthG d a) where
  mempty = Depth (maxBound,undefined)
  Depth p `mappend` Depth p' = Depth (p `closer` p')
The second method definition can be simplified nicely:
mappend = inDepth2 closer
  where
    inDepth2 = unDepth ~> unDepth ~> Depth
using the ideas from Prettier functions for wrapping and unwrapping and the notational improvement from Matt Hellige’s Pointless fun.
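If you haven’t read those posts, here is the combinator involved, as I reconstruct it (so treat the exact definition as my assumption rather than a quotation):

> -- The "pointless fun" combinator: pre-process the argument with f and
> -- post-process the result with g.
> (~>) :: (a' -> a) -> (b -> b') -> ((a -> b) -> (a' -> b'))
> (f ~> g) h = g . h . f
> infixr 1 ~>
>
> -- With it (and the unDepth accessor from the newtype above),
> --   inDepth2 :: ((d,a) -> (d,a) -> (d,a))
> --            -> (DepthG d a -> DepthG d a -> DepthG d a)
> -- which is exactly the shape needed to lift closer to DepthG values.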
FRP — Future values
The Monoid instance for Depth may look familiar to you if you’ve been following along with my posts on future values, or if you’ve read the paper Simply efficient functional reactivity.
A future value has a time and a value.
Usually, the value cannot be known until its time arrives.
newtype FutureG t a = Future (Time t, a)
instance (Ord t, Bounded t) => Monoid (FutureG t a) where
  mempty = Future (maxBound, undefined)
  Future (s,a) `mappend` Future (t,b) =
    Future (s `min` t, if s <= t then a else b)
When we’re using a non-lazy (flat) representation of time, this mappend definition can be written more simply:
mappend = minBy futTime
futTime (Future (t,_)) = t
Equivalently,
mappend = inFuture2 (minBy fst)
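where inFuture2 is the analogous wrapper for futures. A sketch of what I mean (my reconstruction, not necessarily Reactive’s exact code):

> -- Lift a binary function on (time, value) pairs to futures, by analogy
> -- with inDepth2 above.
> unFuture :: FutureG t a -> (Time t, a)
> unFuture (Future p) = p
>
> inFuture2 :: ((Time t, a) -> (Time t, a) -> (Time t, a))
>           -> (FutureG t a -> FutureG t a -> FutureG t a)
> inFuture2 = unFuture ~> unFuture ~> Future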
There is really nothing special about time in the Time type. It is just a synonym for the Max monoid, as needed for the Applicative and Monad instances.
This connection with future values means we can discard more code.
type RenderingD d = Image (FutureG d Color)
renderD :: (Ord d, Bounded d) => Geometry -> RenderingD d
Now we have our monoid (homo)morphism properties:
renderD mempty == mempty
renderD (ga `mappend` gb) == renderD ga `mappend` renderD gb
And we’ve eliminated the code for unionR by reusing an existing type (future values).
Future values?
What does it mean to think about depth/color pairs as being “future” colors?
If we were to probe outward along a ray, say at the speed of light, we would bump into some number of 3D objects.
The one we hit earliest is the nearest, so in this sense, mappend on futures (choosing the earlier one) is the right tool for the job.
I once read that a popular belief in the past was that vision (light) reaches outward to strike objects, as I’ve just described. I’ve forgotten where I read about that belief, though I think in a book about perspective, and I’d appreciate a pointer from someone else who might have a reference.
We moderns believe that light travels to us from the objects we see. What we see of nearby objects comes from the very recent past, while of further objects we see the more remote past. From this modern perspective, therefore, the connection I’ve made with future values is exactly backward. Now that I think about it in this way, of course it’s backward, because we see (slightly) into the past rather than the future.
Fixing this conceptual flaw is simple: define a type of “past values”. Give them exactly the same representation as future values, and derive all of the class instances.
newtype PastG t a = Past (FutureG t a)
  deriving (Monoid, Functor, Applicative, Monad)
Alternatively, choose a temporally neutral replacement for the name “future values”.
The bug in Z-buffering
The renderD function implements continuous, infinite Z-buffering, with mappend performing the z-compare and conditional overwrite. Z-buffering is the dominant algorithm used in real-time 3D graphics and is supported in hardware even on low-end graphics cards (though not in its full continuous and infinite glory). However, Z-buffering also has a serious bug: it is only correct for fully opaque colors.
Consider a geometry g and a point p in the domain of the result image. There may be many different points in g that project to p. If g has only fully opaque colors, then at most one place on g contributes to the rendered image at p, and specifically, the nearest such point. If g is the union (mappend) of two other geometries, g == ga `union` gb, then the nearest contribution of g (for p) will be the nearer (mappend) of the nearest contributions of ga and of gb.
When colors may be partially opaque, the color of the rendering at a point p can depend on all of the points in the geometry that get projected to p. Correct rendering in the presence of partial opacity requires a fold that combines all of the colors that project onto a point, in order of distance, where the color-combining function (alpha-blending) is not commutative.
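To make that non-commutativity concrete, here is a sketch of such a fold, using the usual “over” operator on a premultiplied-alpha color representation; the representation is an assumption chosen purely for illustration.

> -- Depth-ordered compositing with the (non-commutative) "over" operator.
> -- Premultiplied-alpha RGBA, chosen here only to keep the sketch short.
> data RGBA = RGBA { red, green, blue, alpha :: Double }
>
> -- front `over` back
> over :: RGBA -> RGBA -> RGBA
> RGBA r g b a `over` RGBA r' g' b' a' =
>   RGBA (r + k*r') (g + k*g') (b + k*b') (a + k*a')
>   where k = 1 - a
>
> transparent :: RGBA
> transparent = RGBA 0 0 0 0
>
> -- Composite a list of colors ordered from nearest to farthest.
> composite :: [RGBA] -> RGBA
> composite = foldr over transparent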
Consider again g == ga `union` gb. The contributions of ga to p might be entirely closer than the contributions of gb, or entirely further, or interleaved. If interleaved, then the colors generated from each cannot be combined into a single color for further combination.
To handle the general case, replace the single distance/color pair with an ordered collection of them:
type RenderingD d = Image [FutureG d Color] -- multiple projections, first try
Rendering a union (mappend) requires a merging of two lists of futures (distance/color pairs) into a single one.
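For concreteness, here is one way such a merge might look, assuming both lists are already ordered by distance and comparing via a projection (futTime above, or fst on the raw pairs):

> -- Merge two distance-ordered lists, preserving order; a stand-in for the
> -- merging behavior we need from our combining operation.
> mergeOn :: Ord k => (a -> k) -> [a] -> [a] -> [a]
> mergeOn _ [] ys = ys
> mergeOn _ xs [] = xs
> mergeOn key (x:xs) (y:ys)
>   | key x <= key y = x : mergeOn key xs (y:ys)
>   | otherwise      = y : mergeOn key (x:xs) ys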
More FRP — Events
Sadly, we’ve now lost our monoid morphism, because list mappend is (++), not the required merging. However, we can fix this problem as we did before, by introducing a new type. Or, we can look for an existing type that matches our required semantics. There is just such a thing in the Reactive formulation of FRP, namely an event. We can simply use the FRP Event type:
type RenderingD d = Image (EventG d Color)
renderD :: (Ord d, Bounded d) => Geometry -> RenderingD d
Spatial transformation
Introducing depths allowed rendering to be defined compositionally with respect to geometric union. Is the depth model, enhanced with lists (events), sufficient for compositionality of rendering with respect to other Geometry operations as well?
Let’s look at spatial transformation.
(*%) :: Transform3 -> Geometry -> Geometry
Compositionality of rendering would mean that we can render xf *% g by rendering g and then using xf in some way to transform that rendering. In other words, there would have to exist a function (*%%) such that
forall xf g. renderD (xf *% g) == xf *%% renderD g
I don’t know if the required (*%%) function exists, or what restrictions on Geometry or Transform3 it implies, or whether such a function could be useful in practice.
Instead, let’s change the type of renderings again, so that rendering can accumulate transformations and apply them to surfaces.
type RenderingDX d = Transform3 -> RenderingD d
renderDX :: (Ord d, Bounded d) => Geometry -> RenderingDX d
with or without correct treatment of partial opacity (i.e., using futures or events).
This new function has a simple specification:
renderDX g xf == renderD (xf *% g)
from which it follows that
renderD g == renderDX g identityX
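Read as a definition, the specification already gives a working sketch (assuming the identity transform is called identityX, as above):

> -- Taking the specification as a definition; renderD and (*%) as above.
> renderDX :: (Ord d, Bounded d) => Geometry -> RenderingDX d
> renderDX g = \xf -> renderD (xf *% g)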
Rendering a transformed geometry then is a simple accumulation, justified as follows:
renderDX (xfi *% g)
== {- specification of renderDX -}
\xfo -> renderD (xfo *% (xfi *% g))
== {- property of transformation -}
\xfo -> renderD ((xfo `composeX` xfi) *% g)
== {- specification of renderDX -}
\xfo -> renderDX g (xfo `composeX` xfi)
Render an empty geometry:
renderDX mempty
== {- specification of renderDX -}
\xf -> renderD (xf *% mempty)
== {- property of (*%) and mempty -}
\xf -> renderD mempty
== {- renderD is a monoid morphism -}
\xf -> mempty
== {- definition of pure on functions -}
pure mempty
== {- definition of mempty on functions -}
mempty
Render a geometric union:
renderDX (ga `mappend` gb)
== {- specification of renderDX -}
\xf -> renderD (xf *% (ga `mappend` gb))
== {- property of transformation and union -}
\xf -> renderD ((xf *% ga) `mappend` (xf *% gb))
== {- renderD is a monoid morphism -}
\xf -> renderD (xf *% ga) `mappend` renderD (xf *% gb)
== {- specification of renderDX -}
\xf -> renderDX ga xf `mappend` renderDX gb xf
== {- definition of liftA2/(<*>) on functions -}
liftA2 mappend (renderDX ga) (renderDX gb)
== {- definition of mappend on functions -}
renderDX ga `mappend` renderDX gb
Hurray! renderDX is still a monoid morphism. The two properties of transformation and union used above say together that (xf *%) is a monoid morphism for all transforms xf.
Pseudonym:
Little-appreciated fact: Rays (as in ray tracing) are the adjoints of photons.
11 January 2009, 10:26 pm

Dougal Stanton:
I’m currently reading Steven Pinker’s The Stuff of Thought. He devotes lengthy sections to the discussion of repurposing basic metaphors for more abstract domains. One of the simplest he describes is using metaphors of space to describe time (time as a journey, time as a flowing river, looking into the past or the future). It’s interesting that you’re basically doing the opposite here: you already had the temporal metaphors (well, functions) worked out and you’re reusing them to describe three-dimensional space.
12 January 2009, 2:22 am

Michael Robinson:
This is a clever idea, and an enjoyable post! You could extend it to permit accounting for more complicated phenomena, like reflections or diffraction by changing your Time type to be a rooted tree, which of course has a notion of partial ordering rather than a total ordering as in Ord. Metaphorically, the root of such a tree lies in the Image surface, each edge is marked with a distance/propagation time, and each node in the tree is on a reflective portion of your Geometry. It seems this might mesh well with your notion of interactive FRP.
13 January 2009, 12:02 pm

conal:
Hi Michael. I love your idea for extending the FRP abstractions in ways that account for reflections, diffraction etc, and then seeing what these extended abstractions suggest for FRP. Thanks!
13 January 2009, 7:36 pm

Artyom Shalkhakov:
An enjoyable post, thanks. The functional approach seems very elegant.
I haven’t looked into Reactive and FieldTrip; are you describing them?
PS when will Reactive be ready for usage in real applications?
13 January 2009, 8:48 pm

conal:
Hi Artyom. Thanks for the feedback. Yes, I’m talking about Reactive as an approach to FRP that is built around the notion of future values. Also about FieldTrip, though I haven’t yet applied my new semantic ideas in the library.
About Reactive and real applications, I think it’s getting quite close. There’s been a persistent troubling bug that now appears to be rooted in a concurrency bug in GHC’s run-time system. That bug has been fixed, and once I can get a new stable version, I’ll better know how close we are. I’ve also been exploring changes to Reactive’s programming model that I think will make FRP more well-suited to space-time modular user-interaction. You can find hints about these changes in some of my recent blog posts.
13 January 2009, 9:23 pm

Peter Verswyvelen:
A bit off topic. You write:
“Each point on these surfaces has a location, a normal (perpendicular direction), and material properties (determining how light is reflected by and transmitted through the surface at the point).”
In practice, points also need tangents and bitangents for certain types of lighting. What are your thoughts on that? See e.g. http://www.terathon.com/code/tangent.html
16 January 2009, 4:53 pm

conal:
Thanks, Peter. I’d forgotten that normals aren’t enough for all lighting models. An easy solution is to throw away less information. In place of location and normal, we might have location and first derivative (full, i.e., both partial derivatives), or a full (infinite) derivative tower.
17 January 2009, 9:25 am

andreas:
Hi conal. I am really excited about your interesting post. So I was thinking about a small example where you don’t need the color but just the depth value, like ShadowMapping. Semantically something like
And a Light that uses the image (creating a dependency).
What are your thoughts on that? Could that also be described?
20 January 2009, 10:43 am

newsham:
So (stretching a metaphor) the future occludes the more distant future in FRP?
20 January 2009, 10:59 am

conal:
@newsham – Yes, that’s the idea: futures are opaque. Events are partially transparent. For fancier rendering, using transmission through 3D media (air, jello, etc), the (per-ray) model is analogous to (continuous-time) behaviors with integration.
20 January 2009, 11:28 am